We present a search for the standard model Higgs boson produced in association with a W
boson. This search uses data corresponding to an integrated luminosity of 7.5 fb$^{-1}$ collected by
the CDF detector at the Tevatron. We select $WH \to \ell\nu b\bar{b}$ candidate events with two jets, large
missing transverse energy, and exactly one charged lepton. We further require that at least one jet
be identified as originating from a bottom quark. Discrimination between the signal and the large
background is achieved through the use of a Bayesian artificial neural network. The number of
tagged events and their distributions are consistent with the standard model expectations. We
observe no evidence for a Higgs boson signal and set 95% C.L. upper limits on the WH production
cross section times the branching ratio to decay to $b\bar{b}$ pairs, $\sigma(p\bar{p} \to W^{\pm}H) \times \mathcal{B}(H \to b\bar{b})$, relative
to the rate predicted by the standard model. For the Higgs boson mass range of 100 GeV/$c^2$ to 150
GeV/$c^2$ we set observed (expected) upper limits from 1.34 (1.83) to 38.8 (23.4). For 115 GeV/$c^2$
the upper limit is 3.64 (2.78). The combination of the present search with an independent analysis
that selects events with three jets yields more stringent limits ranging from 1.12 (1.79) to 34.4 (21.6)
in the same mass range. For 115 and 125 GeV/$c^2$ the upper limits are 2.65 (2.60) and 4.36 (3.69),
respectively.

PACS numbers: 13.85.Rm, 14.80.Bn
The standard model (SM) describes not only the fundamental
particles of quarks and leptons and their inter-
actions, but also predicts the existence of a single scalar 
particle, the Higgs boson, which arises as a result of spon- 
taneous electroweak symmetry breaking [l|-IH . The Higgs 
boson remains the only fundamental SM particle that 
has not been observed by experiment. Direct searches 
at LEP2 0], the Tevatron Q, and recently LHC experi- 
ments 0, |9[ have constrained the Higgs boson mass to lie 
in the range between 115.5 and 127 GeV/$c^2$ at 95% C.L.,
which is consistent with the 95% C.L. upper limit of 152
GeV/$c^2$ obtained from global fits to precision electroweak
data [10].

In $\sqrt{s} = 1.96$ TeV proton-antiproton collisions, the
Higgs boson is expected to be produced mainly through
gluon fusion ($gg \to H$) and in association with a W or
Z boson [11]. The cross section for WH production is
twice that of ZH and is about a factor of 10 smaller
than that of $gg \to H$. The Higgs boson decay branching
fraction is dominated by $H \to b\bar{b}$ for Higgs boson masses
$m_H < 135$ GeV/$c^2$ and by $H \to W^+W^-$ for $m_H > 135$
GeV/$c^2$ [Hj]. A search for a low-mass Higgs boson ($m_H < 135$
GeV/$c^2$) in the $gg \to H \to b\bar{b}$ channel is extremely
challenging because the $b\bar{b}$ QCD production rate is many
orders of magnitude larger than the Higgs boson production
rate. Requiring the leptonic decay of the associated
W boson greatly improves the expected signal-to-background
ratio in this channel. As a result, $WH \to \ell\nu b\bar{b}$
is one of the most promising channels for low-mass
Higgs boson searches, and it contributes significantly to
the combined search for the Higgs boson at the Teva- 
tron 0. 

This paper presents a search for Higgs boson production
in proton-antiproton collisions at $\sqrt{s} = 1.96$ TeV in the
$WH \to \ell\nu b\bar{b}$ channel, using data collected between
February 2002 and March 2011 with the CDF detector.
The acquired data correspond to an integrated
luminosity of approximately 7.5 fb$^{-1}$. Searches for the
standard model Higgs boson using the same final state
have been reported before by CDF [H, [3 and D0 [3
with data corresponding to integrated luminosities of
5.6 fb$^{-1}$ and 5.3 fb$^{-1}$, respectively. Compared to the
previously reported analysis, we have employed a Bayesian
artificial neural network (BNN) discriminant ljl □J to
improve discrimination between signal and background.
The signal acceptance is improved by using additional
triggers based on jets and missing transverse energy, as
well as a novel method to combine them into a single
analysis stream in order to maximize the event yield
while properly accounting for correlations between
triggers. The signal acceptance is also increased by using
several different lepton reconstruction algorithms for muon
and electron candidates. We have optimized the b-tagging
algorithms used in the analysis to increase signal
acceptance. We also employed multivariate methods to
improve the rejection of multijet QCD background, as well
as to improve the dijet invariant mass resolution.

Recently, the experiments at the Large Hadron Col- 
lider (LHC) have obtained enough data to set limits on 
the Higgs boson mass exceeding the sensitivity of the 
Tevatron experiments [HQ. However, at the LHC the 
most sensitive low-mass search is in the diphoton final
state, and searches for $H \to b\bar{b}$ will take more data
before the Tevatron sensitivity is reached in this channel.
In this sense, the Tevatron and LHC are complementary
and both will provide important information in the
search for a low-mass Higgs boson. 



This paper is organized as follows. Section II describes
the experimental apparatus, the Collider Detector at
Fermilab (CDF). Section III presents the data samples and
the event selection used to identify the $WH \to \ell\nu b\bar{b}$
candidate events. Section IV presents the background
modeling and its estimation. Section V discusses the signal
acceptance and its systematic uncertainty. Section VI
introduces advanced techniques to improve the analysis
sensitivity further. The final results and conclusions are
presented in Sec. VII and Sec. VIII.



II. THE CDF II DETECTOR 

The CDF II detector [3 geometry is described using
a cylindrical coordinate system. The z-axis follows
the proton direction, and the polar angle $\theta$ is usually
expressed through the pseudorapidity $\eta = -\ln(\tan(\theta/2))$.
The detector is approximately symmetric around $\eta = 0$
and in the azimuthal angle $\phi$. The energy transverse to
the beam is defined as $E_T = E \sin\theta$, and the momentum
transverse to the beam is $p_T = p \sin\theta$.
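
As a rough illustration of the coordinate definitions above, the following Python sketch evaluates the pseudorapidity and the transverse energy for a given polar angle. The function names are hypothetical and are not taken from any CDF software.

```python
import math

def pseudorapidity(theta):
    """eta = -ln(tan(theta/2)) for a polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))

def transverse_energy(energy, theta):
    """E_T = E * sin(theta); the same form gives p_T from the momentum p."""
    return energy * math.sin(theta)

# A particle emitted perpendicular to the beam has eta close to 0
# and carries all of its energy in the transverse plane.
eta_perp = pseudorapidity(math.pi / 2.0)
et_perp = transverse_energy(50.0, math.pi / 2.0)
```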

Charged particles are tracked by a system of silicon
microstrip detectors [3 and a large open cell drift
chamber [2(| in the regions $|\eta| < 2.0$ and $|\eta| < 1.0$, respectively.
The tracking detectors are immersed in a 1.4 T solenoidal
magnetic field aligned with the incoming beams, allowing
measurement of the charged particle $p_T$.

The transverse momentum resolution is measured to
be $\delta p_T/p_T \approx 0.07\% \cdot p_T$ (GeV/$c$) for the combined
tracking system [3- The resolution on the track impact
parameter ($d_0$), the distance from the beam-line axis to the
track at the track's closest approach in the transverse
plane, is $\sigma(d_0) \approx 40\ \mu$m, of which about 30 $\mu$m is due to
the transverse size of the Tevatron beam itself [3 .

Outside of the tracking systems and the solenoid,
segmented calorimeters with projective tower geometry
are used to reconstruct electromagnetic showers and
hadronic jets [21-23] over the pseudorapidity range $|\eta| < 3.6$.
The transverse energy is measured in each calorimeter
tower, where the polar angle ($\theta$) is calculated using the
measured z position of the event vertex and the tower
location.

Contiguous groups of calorimeter towers with signals
are identified and summed together into an energy
cluster. Electron candidates are identified in the central
electromagnetic calorimeter (CEM) or in the forward
electromagnetic calorimeter, known as the plug (PEM),
as isolated, mostly electromagnetic clusters that
match a reconstructed silicon track in the pseudorapidity
range $|\eta| < 1.1$ and $1.1 < |\eta| < 2.0$, respectively.
The electron transverse energy is reconstructed from the
electromagnetic cluster with a precision
$\sigma(E_T)/E_T \approx 13.5\%/\sqrt{E_T\,(\mathrm{GeV})} \oplus 2\%$ for central electrons [21] and
$\sigma(E_T)/E_T \approx 16.0\%/\sqrt{E_T\,(\mathrm{GeV})} \oplus 2\%$ for plug electrons [24].
Jets are identified as a group of electromagnetic
calorimeter energy ($E_{\mathrm{em}}$) and hadronic calorimeter
energy ($E_{\mathrm{had}}$) clusters populating a cone of radius
$\Delta R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2} < 0.4$ units around a high-$E_T$
seed cluster [25]. Jet energies are corrected for calorimeter
nonlinearity, losses in the gaps between towers, and
multiple primary interactions. The jet energy resolution
is approximately $\sigma(E_T) \approx [0.1\,E_T + 1.0\ \mathrm{GeV}]$ H|.
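
The resolution formulas and the cone definition quoted above can be evaluated numerically. The sketch below assumes that the symbol $\oplus$ denotes a sum in quadrature, as is conventional for calorimeter resolutions, and uses the central-electron coefficients from the text as defaults; the function names are illustrative only.

```python
import math

def electron_et_resolution(et_gev, stochastic=0.135, constant=0.02):
    """Fractional resolution sigma(E_T)/E_T = stochastic/sqrt(E_T) (+) constant,
    where (+) is taken to be a sum in quadrature (assumption)."""
    return math.hypot(stochastic / math.sqrt(et_gev), constant)

def delta_r(eta1, phi1, eta2, phi2):
    """Cone distance sqrt(dphi^2 + deta^2), with dphi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def in_jet_cone(eta, phi, seed_eta, seed_phi, radius=0.4):
    """True if a cluster lies within the cone of the given radius around the seed."""
    return delta_r(eta, phi, seed_eta, seed_phi) < radius
```

For a 20 GeV central electron this gives a fractional resolution of roughly 3.6%; the plug values follow by passing `stochastic=0.16`.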

Muon candidates are detected in three separate
subdetectors. After at least five interaction lengths in the
calorimeter, central muons first encounter four layers
of planar drift chambers (CMU), capable of detecting
muons with $p_T > 1.4$ GeV/$c$ [27]. Four additional layers
of planar drift chambers (CMP) behind another 60 cm
of steel detect muons with $p_T > 2.8$ GeV/$c$ [28]. These
two systems cover the same central pseudorapidity region
with $|\eta| < 0.6$. A track that is linked to both CMU and
CMP stubs is called a CMUP muon. Muons that exit
the calorimeters at $0.6 < |\eta| < 1.0$ are detected by the
CMX system of four drift layers. Muon candidates are
then identified as isolated tracks that extrapolate to line
segments or "stubs" in the muon subdetectors.

Missing transverse energy ($\not\!\!E_T$) is defined as the
opposite of the vector sum of all calorimeter tower energy
depositions projected on the transverse plane. It is used
as a measure of the sum of the transverse momenta of the
particles that escape detection, most notably neutrinos.
The corrected energies are used for jets in the vector sum
defining $\not\!\!E_T$. The muon momentum is also added for any
minimum-ionizing high-$p_T$ muon found in the event.
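
A minimal sketch of this definition follows. It is a simplification under stated assumptions: towers and muons are given as (magnitude, azimuth) pairs, jet corrections are taken to be already applied to the tower energies, and the small calorimeter deposit of the muon itself is ignored. The function name is hypothetical.

```python
import math

def missing_et(towers, muons=()):
    """Return (magnitude, phi) of the missing transverse energy.

    towers: iterable of (E_T, phi) calorimeter depositions
    muons:  iterable of (p_T, phi) for high-pT minimum-ionizing muons
    """
    sum_x = sum(et * math.cos(phi) for et, phi in towers)
    sum_y = sum(et * math.sin(phi) for et, phi in towers)
    # Muons leave little energy in the calorimeter, so their momentum
    # is added to the visible transverse sum.
    sum_x += sum(pt * math.cos(phi) for pt, phi in muons)
    sum_y += sum(pt * math.sin(phi) for pt, phi in muons)
    # The missing E_T is the opposite of the visible vector sum.
    met_x, met_y = -sum_x, -sum_y
    return math.hypot(met_x, met_y), math.atan2(met_y, met_x)
```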

Muon and electron candidates used in this analysis
are identified during data taking with the CDF trigger
system, a three-level filter with tracking information
available at the first level [2{|. The first stage of
the central electron trigger (CEM) requires a track with
$p_T > 8$ GeV/$c$ pointing to a tower with $E_T > 8$ GeV and
$E_{\mathrm{had}}/E_{\mathrm{em}} < 0.125$. As appropriate for selecting
W-decay electrons, the plug electron trigger (MET+PEM)
requires a tower with $E_T > 8$ GeV, $E_{\mathrm{had}}/E_{\mathrm{em}} < 0.125$,
and missing transverse energy $\not\!\!E_T > 15$ GeV. The
first stage of the muon trigger requires a track with
$p_T > 4$ GeV/$c$ (CMUP) or 8 GeV/$c$ (CMX) pointing to
a muon stub. A complete lepton reconstruction is
performed online in the final trigger stage, where we require
$E_T > 18$ GeV for central electrons (CEM), $E_T > 18$ GeV
and $\not\!\!E_T > 20$ GeV for plug electrons (MET+PEM), and
$p_T > 18$ GeV/$c$ for muons (CMUP, CMX).

The $\not\!\!E_T$ + 2 jet trigger, previously used in
the WH analysis [HI, complements the high-$p_T$
lepton triggers by identifying a lepton from WH decay
as a high-$p_T$ track, isolated from other tracks, that has
failed the lepton triggers mentioned above. At high
instantaneous Tevatron luminosity, the accept rate of this
trigger is reduced (pre-scaled) by randomly sampling a
luminosity-dependent fraction of events. This trigger also
requires two jets with $E_T > 10$ GeV, one of them central
($|\eta| < 1.1$), and $\not\!\!E_T > 35$ GeV. We also include a second
$\not\!\!E_T$ + 2 jet trigger, which was introduced only in
the second part of the data and requires two jets with
$E_T > 10$ GeV and $\not\!\!E_T > 30$ GeV. We also include a third
trigger based on $\not\!\!E_T$ only, requiring $\not\!\!E_T > 45$ GeV for the first
part of the data; the selection criterion is relaxed to
40 GeV for the second part of the data.

The efficiency of the different triggers is measured
using the lepton-triggered data and is parametrized using
sigmoid turn-on curves as a function of $\not\!\!E_T$, without
correcting for the muon momenta. The novel method
used to combine and parametrize all three trigger
paths is described in [3Cj; it can be generalized to any
combination of different trigger paths, allowing optimal
performance.
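
To make the notion of a sigmoid turn-on curve concrete, the sketch below uses a logistic function. The exact functional form and the parameter values used in the analysis are not given in this text, so the form, the parameter names (`plateau`, `midpoint`, `width`), and the numbers in the example are assumptions for illustration.

```python
import math

def turn_on(met_gev, plateau, midpoint, width):
    """Logistic trigger efficiency as a function of missing E_T (GeV).

    plateau:  efficiency reached far above threshold
    midpoint: missing E_T at which the efficiency is half the plateau
    width:    controls how sharply the efficiency rises
    """
    return plateau / (1.0 + math.exp(-(met_gev - midpoint) / width))

# Hypothetical parameters: half-efficiency at 45 GeV, plateau of 98%.
curve = [turn_on(met, 0.98, 45.0, 5.0) for met in (20.0, 45.0, 80.0)]
```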



III. DATA SAMPLES AND EVENT SELECTION 

The data collected using the lepton-based (CEM,
CMUP, CMX and MET+PEM) triggers correspond to
7.5 ± 0.4 fb$^{-1}$ of integrated luminosity, while the data
from the $\not\!\!E_T$-based triggers correspond to 7.3 ± 0.4 fb$^{-1}$.

The $WH \to \ell\nu b\bar{b}$ signal consists of two b jets, a high-$p_T$
lepton, and large missing energy. This section provides
an overview of the signal reconstruction with a focus on
the improvements of this analysis over a previous WH
search [13].

A. Improving Lepton Identification 

We use several different lepton identification
algorithms in order to include events from multiple trigger
paths. Each algorithm requires a single high-$p_T$ (> 20
GeV/$c$), isolated charged lepton consistent with leptonic
W boson decay. Because the lepton from a leptonic W
decay is well isolated from the rest of the event, the
additional energy in the cone of $\Delta R = 0.4$ surrounding the
lepton is required to be less than 10% of the lepton
energy. We employ the same lepton identification
algorithms as the prior CDF WH search [3. The tight
lepton is required to be identified as either an electron
(CEM, PEM), a muon (CMUP, CMX), or an isolated
track from the data collected with $\not\!\!E_T$ triggers.

We further improve the lepton acceptance by about
10% by including two additional lepton identification
algorithms. One lepton type is selected from
CEM-triggered events using a multivariate likelihood method
to select electron candidates that fail the standard
electron requirements. Another lepton type is selected from
$\not\!\!E_T$-triggered events by requiring an isolated track with
significant deposits of energy in the calorimeter. Such
tracks primarily originate from the leptonic decay of the
W boson, where the electrons fail the standard
identification, or from $\tau$ leptons that decay into single charged
hadrons.

The efficiency of lepton identification is measured
using $Z \to e^+e^-$ and $Z \to \mu^+\mu^-$ samples. A pure sample
of leptons is obtained by selecting events where the
invariant mass of two high-$p_T$ tracks is near the mass of
the Z boson and one track passed the trigger and tight
lepton selection. The efficiency is then measured using
the other, unbiased track. The same procedure is applied
to simulated events, and a correction factor is
applied to correct the difference due to imperfect detector
modeling.
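
The procedure above amounts to a tag-and-probe measurement followed by a data-to-simulation ratio. The following sketch shows the arithmetic under the assumption that the efficiency is a simple pass fraction of the unbiased (probe) tracks with a binomial uncertainty; the function names and the counts in the example are hypothetical.

```python
import math

def efficiency(n_pass, n_total):
    """Pass fraction of probe leptons and its binomial uncertainty."""
    eff = n_pass / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err

def scale_factor(data_pass, data_total, mc_pass, mc_total):
    """Correction factor applied to simulation: eff(data) / eff(simulation)."""
    eff_data, _ = efficiency(data_pass, data_total)
    eff_mc, _ = efficiency(mc_pass, mc_total)
    return eff_data / eff_mc

# Hypothetical counts: 900 of 1000 probes pass in data, 950 of 1000 in simulation.
sf = scale_factor(900, 1000, 950, 1000)
```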



B. b-jet Identification 

Multijet final states have dominant contributions from
QCD light-flavor jet production. The low-mass standard
model Higgs boson decays predominantly to b-quark
pairs. Jets from b quarks can be distinguished from
light-flavor jets by looking for the decay of long-lived
B hadrons within the jet cone. We employ three
b-identification algorithms to optimize the selection of
b-quark jets. The secondary vertex tagging algorithm [31]
(SECVTX) attempts to reconstruct a secondary vertex
using tracks found within a jet. If a vertex is found
and it is significantly displaced from the $p\bar{p}$ interaction
point (primary vertex), the jet is identified as a b jet
("b-tagged"). The Jet Probability algorithm [H (JP) uses
tracking information from tracks inside a jet to identify B
decays. The algorithm looks at the distribution of impact
parameters for tracks inside a jet to form a probability
that the jet originated from the primary vertex. Light
jets yield a probability distribution approximately
constant between 0 and 1, while b jets preferentially populate
low values of probability. A jet is considered b-tagged if
the jet probability value is less than 5%. The neural
network tagging algorithm [33] (NN) combines the strengths
of existing b-tagging information more efficiently using a
multivariate technique exploiting variables such as
displaced vertices, displaced tracks, and low-$p_T$ muons from
b-quark decay. The NN provides an output value ranging
from -1 (light-jet-like) to 1 (b-jet-like). The cut on this
continuous output has been tuned to provide maximum
sensitivity: a jet is considered b-tagged if the jet's NN
output is positive (> 0).

To increase the signal-to-background ratio for WH
events, at least one jet must be b-tagged by the SECVTX
algorithm. We then divide our sample into four exclusive
categories in a preferential order based on the purity
of b-tagged jets. The first category (ST+ST) comprises
events where there are two SECVTX b-tagged jets. The
second category (ST+JP) consists of events where only
one of the jets is b-tagged by SECVTX and the second
jet is b-tagged only by JP. The third category (ST+NN)
is similar to the second, but the second jet is b-tagged
only by NN. The fourth category (ST) contains events
where only one of the jets is b-tagged by SECVTX and
the second jet is not b-tagged.
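
The exclusive, ordered categorization can be sketched as a short decision function. This reflects one reading of the text (a second jet tagged by both JP and NN is assigned to ST+JP because that category comes first in the preferential order); the dictionary keys and the function name are hypothetical.

```python
def tag_category(jets):
    """Assign a two-jet event to one exclusive tagging category.

    jets: list of two dicts with boolean keys 'secvtx', 'jp', 'nn'.
    Returns 'ST+ST', 'ST+JP', 'ST+NN', 'ST', or None if no jet is SECVTX tagged.
    """
    n_secvtx = sum(1 for jet in jets if jet["secvtx"])
    if n_secvtx == 0:
        return None          # event fails the basic tagging requirement
    if n_secvtx == 2:
        return "ST+ST"
    # Exactly one SECVTX tag: inspect the other jet in order of purity.
    other = next(jet for jet in jets if not jet["secvtx"])
    if other["jp"]:
        return "ST+JP"
    if other["nn"]:
        return "ST+NN"
    return "ST"
```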

C. Lepton + Jets Selection 

After identifying the final state in the event, we require
that the events contain one high-$p_T$ lepton (> 20 GeV/$c$),
corrected $\not\!\!E_T > 20$ GeV (25 GeV in the case of forward
electrons), and two jets with corrected $E_T > 20$ GeV and
$|\eta| < 2.0$. The event's primary vertex is calculated by
fitting a subset of well-measured tracks coming from the
beam line and is required to be within 60 cm of the center
of the CDF II detector [18]. The longitudinal coordinate
$z_0$ of the lepton track at the point of closest approach to the
beam line must be within 5 cm of the primary vertex to
ensure that the lepton and the jets come from the same
hard interaction. In order to reduce the Z + jets and
WW/WZ background rates, events with more than one
lepton are rejected. Events from $Z \to \ell^+\ell^-$ decays in
which one lepton is not identified are removed by vetoing
events where the invariant mass of the lepton and any
track in the event is within the Z mass window between
76 and 106 GeV/$c^2$.

Before applying any b-tagging algorithm, the sample
(pretag sample) has dominant contributions from W +
jets and QCD multijet production. We use the b-tagging
strategies outlined above to increase the signal purity of
the W + 2 jet events. We further purify the sample with
exactly one secondary vertex tagged jet (ST) by applying
additional kinematic and angular cuts to reduce QCD
multijet events that mimic the W-boson signature. The
rejection is based on a support vector machine multivariate
discriminant that was optimized to identify the W +
jets events against the QCD events (Hf.

IV. BACKGROUND ESTIMATION 

The final state signature of $WH \to \ell\nu b\bar{b}$ production
can be mimicked by a number of processes. The dominant
backgrounds are W + jets production, $t\bar{t}$ production,
single top production, and QCD multijet production.
Several electroweak production processes (diboson
or Z + jets) also contribute with smaller rates. We
estimate the background rates based on the same strategies
used in the previous top cross section measurement [31],
single top searches [35], and WH analysis [13]. We
provide an overview of each background estimate below.

A. Top and Electroweak Backgrounds 

Production of both top-quark pairs and single top
quarks contributes to the tagged W+jets sample. Several
electroweak boson production processes also contribute.
Pairs of WW can decay to a lepton, a neutrino (seen as
missing energy), and two jets, one of which may originate
from a charm quark. Pairs of WZ events can decay to the
signal $\ell\nu b\bar{b}$ or $\ell\nu c\bar{c}$ final state. Finally, $Z \to \tau^+\tau^-$ events
with one leptonic $\tau$ decay and one hadronic decay
contribute, yielding a lepton, missing transverse energy, and a
narrow jet displaced from the primary interaction point.

The normalizations of the diboson and top production
backgrounds are based on the theoretical cross
sections [36-38] listed in Table I, the time-integrated
luminosity, and the acceptance and b-tagging efficiency
derived from Monte Carlo events. The acceptance is
corrected based on measurements using data for lepton
identification, trigger efficiencies, b-tagging efficiencies, and
the z vertex cut. The total top and electroweak
contributions in each tagging category are shown in Table II.
We use the measured inclusive cross section (787.4 ± 85.0
pb) for Z + jets.

TABLE I: Theoretical cross sections and uncertainties for the
electroweak and single top backgrounds, along with the
theoretical cross section for $t\bar{t}$ at $m_t = 172.5$ GeV/$c^2$.

Background              Theoretical cross section [pb]
WW                      11.66 ± 0.70
WZ                      3.46 ± 0.30
ZZ                      1.51 ± 0.20
single-top s-channel    1.05 ± 0.07
single-top t-channel    2.10 ± 0.19
$t\bar{t}$              7.04 ± 0.44



B. W + heavy flavor 

The $Wb\bar{b}$, $Wc\bar{c}$, and $Wc$ processes (W + heavy flavor)
are major background sources after the b-tagging
requirement. Large theoretical uncertainties exist for the overall
normalization because current Monte Carlo event
generators can generate W+heavy-flavor events only at tree
level. Consequently, the rates for these processes are
normalized to data. The contribution from true heavy-flavor
production in W+jets events is determined from
measurements of the heavy-flavor event fraction in W+jets
events and the b-tagging efficiency for those events.

The fraction of W+jets events produced with
heavy-flavor jets has been studied extensively using a
combination of ALPGEN + PYTHIA Monte Carlo generators (40r -
. Calculations of the heavy-flavor fraction in ALPGEN
have been calibrated using a jet data sample, and a
scaling factor of 1.4 ± 0.4 is necessary to make the
heavy-flavor production in Monte Carlo match the production
in W+1 jet events.

For the tagged W+heavy flavor (HF) background
estimate, the heavy-flavor fractions and tagging rates are
multiplied by the number of pretag W+jets candidate
events ($N_{\mathrm{pretag}}$) in data, after correcting for the
contributions of non-W events ($f_{\text{non-}W}$, as determined from
the fits described in Section IV C), $t\bar{t}$, and other
background events to the pretag sample. The W+heavy flavor
background contribution is obtained by the following relation:

$$N_{W+\mathrm{HF}} = f_{\mathrm{HF}}\,\epsilon_{\mathrm{tag}}\left[N_{\mathrm{pretag}}\,(1 - f_{\text{non-}W}) - N_{\mathrm{top}} - N_{\mathrm{EWK}}\right] \qquad (1)$$

where $f_{\mathrm{HF}}$ is the heavy-flavor fraction, $\epsilon_{\mathrm{tag}}$ is the
tagging efficiency, $N_{\mathrm{top}}$ is the expected number of $t\bar{t}$ and
single top events, and $N_{\mathrm{EWK}}$ is the expected background
contribution from WW, WZ, ZZ and Z boson events,
as described in Section IV A.

The total W + heavy flavor contributions in each
tagging category are shown in Table II.
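
Equation (1) translates directly into a few lines of code. The sketch below is only an arithmetic illustration; the function name is hypothetical and the inputs in the example are placeholder values, not numbers from this analysis.

```python
def w_heavy_flavor_yield(n_pretag, f_non_w, n_top, n_ewk, f_hf, eff_tag):
    """Evaluate Eq. (1): N_{W+HF} = f_HF * eps_tag * [N_pretag (1 - f_nonW) - N_top - N_EWK]."""
    # Pretag W+jets events after removing the non-W, top, and electroweak parts.
    n_w_jets = n_pretag * (1.0 - f_non_w) - n_top - n_ewk
    # Fraction of those with heavy flavor, times the probability to be tagged.
    return f_hf * eff_tag * n_w_jets

# Placeholder inputs chosen only to show the arithmetic:
# 0.05 * 0.4 * (1000 * 0.9 - 50 - 50) = 16
example = w_heavy_flavor_yield(1000, 0.1, 50, 50, 0.05, 0.4)
```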



C. Non-W QCD Multijet 

Events from QCD multijet production may mimic
the W-boson signature due to instrumental background.
When a jet passes the charged lepton selection criteria
or a heavy-flavor jet produces a charged lepton via
semileptonic decay, the jet is reconstructed incorrectly as
a charged lepton, which is denoted as a non-W lepton.
Non-W $\not\!\!E_T$ can result from mismeasurements of energy
or semileptonic decays of heavy-flavor quarks. Since the
$\not\!\!E_T$ mismeasurement is usually not well modeled in the
detector simulation, we use several different samples of
observed events to model the non-W multijet contribution.
One sample is based on events that fired the central
electron trigger but failed at least two of the five electron
identification requirements that do not depend
on the kinematic properties of the event, such as the
fraction of energy in the hadronic calorimeter. This sample
is used to estimate the non-W contribution from CEM,
CMUP, and CMX events. A second sample is formed
from events that pass a generic jet trigger with transverse
energy $E_T > 20$ GeV to model PEM events. These jets
are additionally required to have a fraction of energy
deposited in the electromagnetic calorimeter between 80%
and 95%, and fewer than four tracks, to mimic electrons.
A third sample, used to model the non-W background in
isolated track events, consists of events that are required
to pass the $\not\!\!E_T$ triggers and contain a muon that passes
all identification requirements but fails the isolation
requirement.

To estimate the non-W fraction in both the pretag and
tagged samples, the $\not\!\!E_T$ spectrum is fit to a sum of the
predicted background shapes. The fit has one fixed
component and two templates whose normalizations can float.
The fixed component is obtained by adding the
contributions of the simulated processes based on theoretical
cross sections. The two floating templates are a Monte
Carlo W + jets template and a non-W template. The
non-W template is different depending on the lepton
category, as explained above. The total non-W contribution
for each tagging category is also shown in Table II.
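
The structure of such a fit, one fixed component plus two floating templates, can be illustrated with a toy example. The fitting method used in the analysis is not specified here; the sketch below substitutes an unweighted linear least-squares fit, solved in closed form, purely to show how the two normalizations and the resulting non-W fraction are obtained. All names and histogram contents are hypothetical.

```python
def fit_two_templates(data, fixed, w_jets, non_w):
    """Least-squares fit of data ~ fixed + a * w_jets + b * non_w (bin by bin).

    All arguments are equal-length lists of bin contents. Returns (a, b).
    NOTE: unweighted least squares is a stand-in for the actual fit method.
    """
    resid = [d - f for d, f in zip(data, fixed)]
    s_ww = sum(w * w for w in w_jets)
    s_qq = sum(q * q for q in non_w)
    s_wq = sum(w * q for w, q in zip(w_jets, non_w))
    s_wr = sum(w * r for w, r in zip(w_jets, resid))
    s_qr = sum(q * r for q, r in zip(non_w, resid))
    det = s_ww * s_qq - s_wq ** 2      # determinant of the 2x2 normal equations
    a = (s_wr * s_qq - s_qr * s_wq) / det
    b = (s_qr * s_ww - s_wr * s_wq) / det
    return a, b

def non_w_fraction(b, non_w, data):
    """Fraction of observed events attributed to the scaled non-W template."""
    return b * sum(non_w) / sum(data)

# Toy histograms: the non-W template peaks at low missing E_T.
w_tmpl = [10.0, 30.0, 40.0, 20.0]
q_tmpl = [50.0, 30.0, 10.0, 5.0]
fixed_part = [5.0, 5.0, 5.0, 5.0]
observed = [f + 2.0 * w + 0.5 * q for f, w, q in zip(fixed_part, w_tmpl, q_tmpl)]
a_fit, b_fit = fit_two_templates(observed, fixed_part, w_tmpl, q_tmpl)
```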



D. Mistagged Jets 

Events with W + light-flavor jets containing no b or c 
quark with a fake b tag (mistags) can contribute to our 
tagged signal sample. We estimate the amount of mistags 
using the number of pretag W + light flavor events and 
the event mistag probability. The amount of pretag W 
+ light flavor is determined from the pretag sample by 
subtracting the events from non-W, top and electroweak, 






TABLE II: Background summary for each b-tagging category after all lepton categories are combined. As a reference, the
expected signal for $m_H = 115$ GeV/$c^2$ is also shown.

                                  ST+ST            ST+JP            ST+NN            ST
Pretag events                     184050
$t\bar{t}$                        14z ± 11         114 ± 12         62.8 ± 6.4       4(9 ± 49
Single top (s-ch)                 40.U ± 0. i      35.1 ± 3.4       18.9 ± 1.8       1UO ± 1U
Single top (t-ch)                 lo.y ± z.4       13.3 ± 2.0       8.7 ± 1.2        lyl ± Z6
WW                                1.0/ ± U.42      6.23 ± 2.08      5.14 ± 1.35      loo ± 2,0
WZ                                Iz.y ± Z.U       10.7 ± 1.2       5.84 ± 0.62      coo ± a o
ZZ                                U.Oz ± u.uy      0.49 ± 0.06      0.29 ± 0.03      z.Uo ± U.zo
Z + jets                          Q «y1 ± i /in    11.9 ± 1.7       8.75 ± 1.30      loZ ± ZO
$Wb\bar{b}$                       ocy j_ iQ4       228 ± 91         125 ± 50         14^0 ± ^80
$Wc\bar{c}$/$c$                   31.0 ± 12.6      98.3 ± 40.5      63.8 ± 26.0      1761 ± 708
Mistag                            12.1 ± 2.9       52.8 ± 15.2      57.0 ± 14.3      1646 ± 220
Non-W QCD                         57.9 ± 23.6      85.3 ± 34.1      74.9 ± 29.9      747 ± 299
Total background                  584 ± 169        656 ± 194        432 ± 126        6802 ± 1822
Observed events                   519              568              402              6482
WH and ZH signal (115 GeV/$c^2$)  7.28             5.34             2.80             16.0




and W + heavy flavor contributions. The event mistag
probability is based on the per-jet mistag matrix that is
derived from inclusive jet data by counting the number
of false tags per jet for each b tagger and is parametrized
as a function of jet $E_T$, $\eta$, number of vertices, track
multiplicity, and the scalar sum of jet $E_T$ in the event. For
each event in our W + light flavor Monte Carlo samples,
we apply the per-jet mistag matrix to each jet and
combine the probabilities to get an event mistag probability.
The total mistag contribution for each tagging category
is also shown in Table II.
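
The text does not spell out how the per-jet probabilities are combined. One common choice, shown here as an assumption rather than as the method of this analysis, is to treat the jets as independent and take the probability that at least one jet is mistagged. The function name is hypothetical.

```python
def event_mistag_probability(per_jet_probs):
    """Probability that at least one jet in the event is mistagged,
    assuming independent per-jet mistag probabilities (assumption)."""
    p_none = 1.0
    for p in per_jet_probs:
        p_none *= (1.0 - p)   # probability that this jet is NOT mistagged
    return 1.0 - p_none

# Two jets with 1% and 2% per-jet mistag probabilities.
p_event = event_mistag_probability([0.01, 0.02])
```

Summing this probability over all events in the W + light flavor sample, scaled to the pretag W + light flavor yield, would then give a mistag estimate of the kind described above.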



E. Summary of Background Estimation 

The background and signal ($m_H = 115$
GeV/$c^2$) estimates and the number of observed events are
summarized in Table II for each tagging category. In this table,
all lepton types are combined. In general, the numbers
of expected and observed events are in good agreement.



V. SIGNAL ACCEPTANCE 

In this section, the number of expected Higgs events
and systematic uncertainties on the signal acceptance are
discussed. We consider the signal acceptance for the
$WH \to \ell\nu b\bar{b}$ process and the residual contribution of
$ZH \to \ell^+\ell^- b\bar{b}$ where one of the leptons fails the Z removal
cut. We generated $WH \to \ell\nu b\bar{b}$ and $ZH \to \ell^+\ell^- b\bar{b}$
samples using the PYTHIA Monte Carlo program [42] for
11 values of the SM Higgs mass sampled between 100
and 150 GeV/$c^2$. The number of expected $WH \to \ell\nu b\bar{b}$
events ($N$) is given by

$$N = \epsilon \cdot \mathcal{L} \cdot \sigma(p\bar{p} \to WH) \cdot \mathcal{B}(H \to b\bar{b}), \qquad (2)$$

where $\epsilon$, $\sigma(p\bar{p} \to WH)$, and $\mathcal{B}(H \to b\bar{b})$ are the event
detection efficiency, production cross section, and
branching ratio, respectively, and $\mathcal{L}$ is the integrated luminosity
of the data-taking period. The production cross section
and branching ratio are calculated to next-to-leading
order (NLO) precision [i"ll |.
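
As a numerical illustration of Eq. (2), the sketch below multiplies the four factors while keeping track of units: a luminosity in fb$^{-1}$ and a cross section in pb require a factor of 1000. The function name is hypothetical and the inputs in the example are placeholders, not the values used in this analysis.

```python
PB_TO_FB = 1000.0  # 1 pb = 1000 fb

def expected_events(efficiency, lumi_fb_inv, xsec_pb, branching_ratio):
    """Evaluate Eq. (2): N = eps * L * sigma * B, with L in fb^-1 and sigma in pb."""
    return efficiency * lumi_fb_inv * (xsec_pb * PB_TO_FB) * branching_ratio

# Placeholder inputs: 1% efficiency, 7.5 fb^-1, 0.1 pb, branching ratio 0.7.
n_expected = expected_events(0.01, 7.5, 0.1, 0.7)
```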

The total event detection efficiency is the product of
several efficiencies: the trigger efficiency, the primary
vertex reconstruction efficiency, the lepton identification
efficiency, the b-tagging efficiency, and the event kinematic
selection efficiency. The lepton trigger efficiency is
measured using a clean $W \to \ell\nu$ data sample, obtained from
other triggers after applying more stringent offline cuts.
The $\not\!\!E_T$ trigger efficiency is obtained using a trigger
combination method (3pj |. The primary vertex efficiency is
obtained using the vertex distribution from the minimum
bias data. The lepton identification efficiency is
calculated using $Z \to \ell^+\ell^-$ data and Monte Carlo samples.
The b-tag efficiency is measured in a b-enriched sample
from semileptonic heavy flavor decay.

The expected number of signal events is estimated for
each of the probed values of the Higgs boson mass.
Table II shows the number of expected WH and ZH events
for $m_H = 115$ GeV/$c^2$ in 7.5 fb$^{-1}$.

The total systematic uncertainty on the acceptance
comes from several sources, including trigger efficiencies,
the jet energy scale, initial and final state radiation,
lepton identification, luminosity, and b-tagging efficiencies.
The lepton trigger uncertainties are measured using Z
boson decays. The acceptance uncertainty due to the jet
energy scale (JES) [26] is calculated by shifting jet
energies in WH Monte Carlo samples by ± one standard
deviation. The deviation from the nominal acceptance
is taken as the systematic uncertainty. We estimate the
impact of changes in initial state radiation (ISR) and
final state radiation (FSR) by halving and doubling the
parameters related to initial and final state radiation in
the Monte Carlo event generation [43]. The difference
from the nominal acceptance is taken as the systematic
uncertainty. The uncertainty in the incoming partons'
energies relies on the parton distribution function (PDF)
fits. An NLO version of the PDFs, CTEQ6M, provides a
90% confidence interval for each of the eigenvector input
parameters [Hj]. For each eigenvector, the nominal PDF is
reweighted to its 90% confidence level value, and the
corresponding reweighted acceptance is computed. The differences
between the nominal and the reweighted acceptances are
added in quadrature, and the total is assigned as the
systematic uncertainty [31].
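
The quadrature sum used for the PDF uncertainty can be written compactly. This sketch assumes one reweighted acceptance per eigenvector variation and treats every difference symmetrically; the function name and the example acceptances are hypothetical.

```python
import math

def pdf_uncertainty(nominal_acceptance, reweighted_acceptances):
    """Add in quadrature the differences between each reweighted
    acceptance and the nominal acceptance."""
    return math.sqrt(sum((acc - nominal_acceptance) ** 2
                         for acc in reweighted_acceptances))

# Two hypothetical variations shifting a unit acceptance by +3% and -4%.
total = pdf_uncertainty(1.0, [1.03, 0.96])
```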

The lepton identification uncertainties are estimated
based on studies comparing $Z \to \ell^+\ell^-$ events in data
and Monte Carlo.

The systematic uncertainty of 6% in the CDF luminos- 
ity measurement is treated as fully correlated between the 
signal and all Monte Carlo based background samples. 

The systematic uncertainty on the event tagging
efficiency is estimated by varying the b-tagging efficiency
and mistag prediction by ± one standard deviation and
calculating the difference between the shifted acceptance
and the default one.

Total systematic uncertainties are summarized in Ta- 
bles ED GY1 and El 



VI. ANALYSIS OPTIMIZATION 

In this section we discuss the analysis optimization pro- 
cedure after the event selection. 



A. b-jet Energy Correction

The dijet invariant mass provides discrimination be- 
tween signal and background and is a critical variable 
used in the multivariate analysis as described below. Im- 
provement of the dijet mass resolution directly results 
in an improvement of the WH signal sensitivity. To 
improve dijet invariant mass resolution, we developed a 
neural network b-jet energy correction method. The neural
network was trained using a sample of Monte Carlo
simulated $WH \to \ell\nu b\bar{b}$ events. During training, jet
observables were used as input values, and the energy of
the corresponding b quark was used as the target value. 

For each jet, we studied 40 variables related to the
calorimeter energy, the charged tracks, and the displaced
vertices within the jet cone of 0.4 and converged on nine
well-modeled input variables best suited to the jet-energy
correction. The four calorimeter variables chosen
are the jet $E_T$ before and after the standard jet correction,
the jet $p_T$, and the jet transverse mass. The tracking
variables chosen are the sum of the $p_T$ and the maximum $p_T$ of
the set of tracks within the jet cone. For jets tagged
by SECVTX we also include three vertex variables:
the secondary vertex transverse decay length, its uncertainty,
and the fitted secondary vertex $p_T$. Further details
can be found in Ref. [51]. Without (with) applying NN
corrections to b-jets in the Higgs decays, the dijet mass
resolution is ~15% (~11%) for double-tagged events, and
~17% (~13%) for single-tagged events.
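
To make the role of the correction concrete, the sketch below computes a dijet invariant mass from two jet four-vectors before and after a per-jet rescaling. It is only an illustration: the four-vectors and the factors 1.10 and 1.15 are hypothetical, and `scale_jet` is a simple stand-in for the neural network output, which in the analysis is trained to predict the b-quark energy.

```python
import math


def dijet_mass(jet1, jet2):
    """Invariant mass of two jets given as (E, px, py, pz) four-vectors in GeV."""
    e = jet1[0] + jet2[0]
    px = jet1[1] + jet2[1]
    py = jet1[2] + jet2[2]
    pz = jet1[3] + jet2[3]
    m2 = e * e - (px * px + py * py + pz * pz)
    # Guard against small negative values caused by rounding.
    return math.sqrt(max(m2, 0.0))


def scale_jet(jet, factor):
    """Rescale every four-vector component by one factor (stand-in for the NN correction)."""
    return tuple(factor * component for component in jet)


# Hypothetical back-to-back massless jets whose energies are measured too low.
jet_a = (50.0, 50.0, 0.0, 0.0)
jet_b = (45.0, -45.0, 0.0, 0.0)

uncorrected = dijet_mass(jet_a, jet_b)
corrected = dijet_mass(scale_jet(jet_a, 1.10), scale_jet(jet_b, 1.15))
print(f"uncorrected: {uncorrected:.1f} GeV, corrected: {corrected:.1f} GeV")
```

A jet-by-jet correction narrows the reconstructed mass peak only if it tracks the true energy loss of each jet, which is why the training uses jet observables as inputs rather than a single global scale.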



B. Bayesian Neural Network Discriminant 

To improve the signal-to-background discrimination 
further, we employed a Bayesian neural network (BNN) 
trained on a variety of kinematic variables to distinguish 
WH events from the background [H[i3- For this analy- 
sis, we employ distinct BNN discriminant functions that 
were optimized separately for the different tagging cate- 
gories and each Higgs boson mass in order to maximize 
the sensitivity. 

The BNN configuration has N input variables, 2N hidden
nodes, and one output node. The input variables
were selected by an iterative BNN optimization procedure
from a large number of possible variables. The optimization
procedure identified the most sensitive one-variable
BNN, then looped over all remaining variables
and found the most sensitive two-variable BNN. The process
continued until adding a new variable no longer improved
the sensitivity. The resulting discriminant is then used for
hypothesis testing of a WH signal in the simulated
data as a function of Higgs boson mass. It improves the
background rejection, giving a sensitivity gain of 25%
compared to the most sensitive variable alone.
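
The iterative selection described above is a greedy forward selection, which can be sketched as follows. This is an illustration under stated assumptions rather than the analysis procedure itself: in the analysis each candidate set would require training a BNN and evaluating its sensitivity, whereas here `toy_sensitivity` and the `TOY_GAIN` values are hypothetical placeholders.

```python
def forward_select(candidates, sensitivity):
    """Greedy forward selection: add the best variable until sensitivity stops improving."""
    selected = []
    best = float("-inf")
    remaining = list(candidates)
    while remaining:
        # Score every set formed by adding one more variable to the current selection.
        scored = [(sensitivity(selected + [var]), var) for var in remaining]
        top_score, top_var = max(scored, key=lambda item: item[0])
        if top_score <= best:
            break  # No remaining variable improves the sensitivity.
        selected.append(top_var)
        remaining.remove(top_var)
        best = top_score
    return selected, best


# Hypothetical per-variable gains with a fixed cost per added variable.
TOY_GAIN = {"Mjj": 1.0, "pT_imbalance": 0.4, "HT": 0.3, "noise": 0.01}


def toy_sensitivity(variables):
    """Placeholder figure of merit; a real one would come from a trained network."""
    return sum(TOY_GAIN[v] for v in variables) - 0.05 * len(variables)


chosen, score = forward_select(TOY_GAIN, toy_sensitivity)
print(chosen, round(score, 2))
```

Greedy selection does not guarantee the globally best subset, but it keeps the number of trainings proportional to the square of the number of candidates instead of growing exponentially.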

The discriminant used for the ST+ST tag category is
trained using N = 7 input variables. The most sensitive
variable is $M_{jj}$, the invariant mass calculated from the
two tight jets after applying the neural-network-based jet
energy correction described in Sec. VI A. The second
input variable is the $p_T$ imbalance, which is the difference
between the scalar sum of the $p_T$ of all measured objects
and the $\not{E}_T$: $p_T(\text{jet}_1) + p_T(\text{jet}_2) + p_T(\text{lep}) - \not{E}_T$. The
third variable, $M_{\ell\nu j}^{\max}$, is the invariant mass of the lepton,
$\not{E}_T$, and one of the two jets, where the jet is chosen to
give the maximum invariant mass. The fourth variable
is $Q_{\text{lep}} \times \eta_{\text{lep}}$, the product of the electric charge
and the $\eta$ of the charged lepton. The fifth variable is
$\sum E_T$(loose jets), the scalar sum of the transverse
energies of the loose jets. A loose jet is defined as a jet having
$|\eta| < 2.4$ and $E_T > 12$ GeV, but failing the tight-jet
requirement ($E_T > 20$ GeV and $|\eta| < 2.0$). The sixth
variable is the $p_T$ of the reconstructed W. The last variable
is $H_T$, the scalar sum of the event transverse energies:
$H_T = \sum E_T(\text{jets}) + p_T(\text{lepton}) + \not{E}_T$.
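
Several of these inputs are simple functions of reconstructed quantities. The sketch below computes the $p_T$ imbalance, $H_T$, the charge-times-$\eta$ variable, and the loose-jet $E_T$ sum from the definitions in the text. The function names and the example event are hypothetical, and which jets enter the $H_T$ sum is left to the caller because the text does not specify it.

```python
def pt_imbalance(jet_pts, lepton_pt, met):
    """pT(jet1) + pT(jet2) + pT(lep) minus the missing transverse energy, in GeV."""
    return sum(jet_pts) + lepton_pt - met


def h_t(jet_ets, lepton_pt, met):
    """Scalar sum of jet ET, lepton pT, and missing transverse energy, in GeV."""
    return sum(jet_ets) + lepton_pt + met


def charge_times_eta(charge, eta):
    """Product of the lepton electric charge and its pseudorapidity."""
    return charge * eta


def is_tight(et, eta):
    """Tight-jet requirement quoted in the text."""
    return et > 20.0 and abs(eta) < 2.0


def is_loose(et, eta):
    """Loose jet: |eta| < 2.4 and ET > 12 GeV, but failing the tight-jet requirement."""
    return et > 12.0 and abs(eta) < 2.4 and not is_tight(et, eta)


def sum_et_loose(jets):
    """Scalar sum of ET over loose jets; jets is a list of (ET, eta) pairs."""
    return sum(et for et, eta in jets if is_loose(et, eta))


# Hypothetical event: two tight jets, two extra jets, one lepton, and MET (all in GeV).
jets = [(60.0, 0.5), (40.0, -1.2), (15.0, 2.2), (25.0, 2.3)]
tight_ets = [et for et, eta in jets if is_tight(et, eta)]
print(pt_imbalance(tight_ets, 30.0, 35.0))
print(h_t(tight_ets, 30.0, 35.0))
print(charge_times_eta(-1, 0.8))
print(sum_et_loose(jets))
```

For simplicity the example treats jet $E_T$ and $p_T$ as equal, which holds only for massless jets.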

The discriminant used for both the ST+JP and
ST+NN tag categories is trained with the same input
variables as the ST+ST category, except that the variable
$M_{\ell\nu j}^{\max}$ is replaced with $M_{\ell\nu j}^{\min}$ and the $p_T$ imbalance
is replaced with the $\not{E}_T$. The discriminant used for
the single ST tag category is trained with the same input
variables as the ST+ST category with the exception
that $M_{\ell\nu j}^{\max}$ is replaced by $\not{E}_T$ and an extra variable is
added. The new variable is the output of an
artificial-neural-network-based heavy flavor separator trained to
distinguish b-quark jets from the charm and light flavor
jets after SECVTX tagging [35]. Distributions of
all these variables are checked for both the pretag and
tagged samples to ensure that they are described well by
the background model.

TABLE III: Systematic uncertainties on the acceptance for central leptons (in percent).

Category   JES   ISR/FSR/PDF   Lepton ID   Trigger   b-tag   Total
ST+ST      2.0   4.9           2           < 1       8.6     10.3
ST+JP      2.8   4.9           2           < 1       8.1     10.1
ST+NN      2.2   7.7           2           < 1       13.6    15.9
1-ST       2.3   3.0           2           < 1       4.3     6.1

TABLE IV: Systematic uncertainties on the acceptance for forward electrons (in percent).

Category   JES   ISR/FSR/PDF   Lepton ID   Trigger   b-tag   Total
ST+ST      2.4   7.7           2           < 1       8.6     12.0
ST+JP      3.9   4.5           2           < 1       8.1     10.3
ST+NN      6.7   12.9          2           < 1       13.6    20.0
1-ST       2.9   5.7           2           < 1       4.3     8.0

The training is defined such that the neural network
attempts to produce an output as close to 1.0 as possible
for the Higgs boson signal events and as close to 0.0 as
possible for background events. Figure 1 shows a shape
comparison of the BNN output between signal and background
events for the ST+ST, ST+JP, ST+NN, and 1-ST
samples.



VII. RESULTS 

We perform a direct search for an excess in the signal
region of the BNN output distribution from double-tagged
and single-tagged W + 2 jets events. Figure 2
shows the BNN output distributions for each b-tagging
category. The data and background predictions are in
good agreement.

We use a binned likelihood fit [lH H(| to the observed 
BNN output distributions to test the presence of a WH 
signal. For optimal sensitivity, we perform a simultaneous
search in each b-tag and lepton category. The total
likelihood is the product of the single Poisson likelihoods
used in each independent sample. The likelihood
fit accommodates the uncertainties on our background
estimate by letting the overall background prediction
float within Gaussian constraints. The systematic uncertainties
associated with the shape of the BNN output
due to the JES uncertainty are also included for both signal
and background. We use a different set of background
and signal BNN template shapes for each combination
of lepton type and tag category. We correlate the systematic
uncertainties appropriately across different lepton
types and tag categories. We find no evidence for
a Higgs boson signal in our sample. We compute Bayesian
limits with a flat prior restricted to positive values and set
95% C.L. upper limits on the WH cross section times branching ratio,
$\sigma(p\bar{p} \to W^\pm H) \cdot \mathcal{B}(H \to b\bar{b})$, relative to the rate
predicted by the standard model.
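
The core of this procedure, a product of Poisson likelihoods over bins and a Bayesian upper limit with a flat positive prior on the signal strength, can be sketched in a few lines. This is a simplified illustration and not the analysis code: it omits all nuisance parameters (the Gaussian-constrained background normalizations and the JES shape uncertainties), integrates the posterior on a fixed grid, and uses hypothetical bin contents. The names `log_likelihood` and `bayesian_upper_limit` are assumptions.

```python
import math


def log_likelihood(r, observed, signal, background):
    """Sum of Poisson log-likelihoods over bins for signal strength r."""
    total = 0.0
    for n, s, b in zip(observed, signal, background):
        mu = r * s + b  # Expected count; needs b > 0 so that mu > 0 at r = 0.
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total


def bayesian_upper_limit(observed, signal, background,
                         cl=0.95, r_max=50.0, steps=50000):
    """Upper limit on r at credibility cl, with a flat prior on 0 <= r <= r_max."""
    dr = r_max / steps
    log_l = [log_likelihood(i * dr, observed, signal, background)
             for i in range(steps + 1)]
    peak = max(log_l)
    # With a flat prior the posterior is proportional to the likelihood.
    posterior = [math.exp(value - peak) for value in log_l]
    # Cumulative trapezoidal integral of the unnormalized posterior.
    cumulative = [0.0]
    for i in range(steps):
        cumulative.append(cumulative[-1] + 0.5 * (posterior[i] + posterior[i + 1]) * dr)
    target = cl * cumulative[-1]
    for i, value in enumerate(cumulative):
        if value >= target:
            return i * dr
    return r_max


# Hypothetical three-bin templates (signal at SM rate, background, observed counts).
observed_counts = [12, 7, 3]
signal_template = [0.5, 1.0, 1.5]
background_template = [11.0, 6.5, 2.0]
limit = bayesian_upper_limit(observed_counts, signal_template, background_template)
print(f"95% upper limit on signal strength: {limit:.2f} x SM")
```

Because the signal template is normalized to the SM rate, the returned value is directly a limit relative to the SM prediction, which is how the limits in this paper are quoted.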

We compare our observed limits to our expected sensitivity
by generating statistical trials according to the
background-only model and analyzing them in the same way as our data.
The combined expected and observed limits for all the
lepton types are shown in Figure 3 and Table VI. Limits
are also determined for the combination of this analysis
with the independent WH search using a matrix element
technique for events with three jets [14]. The luminosity
used in the three-jet analysis is 5.6 fb$^{-1}$. The combination
improves the expected WH sensitivity by about
5% over the W + two jet result alone. The observed
limits in the two-jet channel in the mass range
$m_H > 110$ GeV/$c^2$ are one standard deviation higher
than expected. After combination with the three-jet analysis, our
limits move closer to the expectation.
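
The size of the gain from the combination can be checked against the expected limits in Table VI. The sketch below assumes that "improvement" means the fractional reduction of the expected limit, which the text does not define explicitly. The helper name `relative_improvement` is illustrative, and the numbers are the tabulated expected limits.

```python
# Expected limits from Table VI, keyed by Higgs boson mass in GeV/c^2:
# (W + 2 jets, W + 2,3 jets).
EXPECTED = {
    100: (1.83, 1.79), 105: (2.08, 1.98), 110: (2.26, 2.17),
    115: (2.78, 2.60), 120: (3.22, 3.06), 125: (4.01, 3.69),
    130: (5.13, 4.80), 135: (7.02, 6.40), 140: (9.39, 8.84),
    145: (15.3, 14.2), 150: (23.4, 21.6),
}


def relative_improvement(two_jet, combined):
    """Fractional reduction of the expected limit after the combination."""
    return (two_jet - combined) / two_jet


improvements = {mass: relative_improvement(*limits) for mass, limits in EXPECTED.items()}
for mass, value in improvements.items():
    print(f"m_H = {mass} GeV/c^2: {100 * value:.1f}%")

mean_improvement = sum(improvements.values()) / len(improvements)
print(f"mean over mass points: {100 * mean_improvement:.1f}%")
```

Under this definition the gain is a few percent at every mass point, of the same size as the figure quoted in the text.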

This $WH(ZH) \to \ell\nu(\ell\not{\ell})b\bar{b}$ analysis represents a substantial
improvement in sensitivity over the prior analysis
using a neural network [13]. The increase in sensitivity
is 25% at $m_H = 115$ GeV/$c^2$ in addition to the improvement
from a larger sample size, and is mainly due to
improved analysis techniques that include the
BNN discriminant, the b-jet energy correction, the additional
$\not{E}_T$ triggers, the loose leptons, and the optimized
b-tagging strategies.

TABLE V: Systematic uncertainties on the acceptance for additional leptons (in percent).

Category   JES   ISR/FSR/PDF   Lepton ID   Trigger   b-tag   Total
ST+ST      1.7   7.1           4.5         3.0       8.6     12.5
ST+JP      2.4   6.4           4.5         3.0       8.1     11.9
ST+NN      1.9   19.5          4.5         3.0       13.6    24.5
1-ST       4.7   8.4           4.5         3.0       4.3     11.8





FIG. 1: Comparison of the BNN output for signal ($M_H = 115$ GeV/$c^2$) and background events with all lepton types included.
From (a) to (d) the b-tag categories are ST+ST, ST+JP, ST+NN, and ST, respectively. Signal and background histograms are
each normalized to unit area. The WH and ZH signals peak near 1.0. The QCD multijet, top quark, W + jets, and
Z + jets backgrounds peak near 0.0. The diboson background has a broad peak in the middle region because its kinematics are very close
to those of the signal. The diboson spike in panel (c) is a statistical fluctuation.

[Figure 2, panel (d) 1-ST: x-axis, BNN output ($M_H = 115$ GeV/$c^2$). Legend: CDF data; diboson (4%); top (12%); Wbb (18%); W/Z+jets (54%); multijets (12%); WH x 10.]



FIG. 2: The observed data and predicted BNN output for signal ($M_H = 115$ GeV/$c^2$) and background events with all leptons
included. From (a) to (d) the b-tag categories are ST+ST, ST+JP, ST+NN, and ST, respectively.



TABLE VI: Observed and expected upper limits at 95% C.L., normalized to the SM expectation, on $\sigma(p\bar{p} \to WH) \times \mathcal{B}(H \to b\bar{b})$
as a function of Higgs boson mass, including all lepton and tag categories, for the present analysis and after combination with an
independent search using a matrix element analysis for events with three jets.

Upper Limits/SM for Combined Lepton and Tag Categories

                       W + 2 jets             W + 2,3 jets
$m_H$ (GeV/$c^2$)    Observed   Expected    Observed   Expected
100                  1.34       1.83        1.12       1.79
105                  2.10       2.08        2.06       1.98
110                  3.42       2.26        2.78       2.17
115                  3.64       2.78        2.65       2.60
120                  4.68       3.22        3.40       3.06
125                  5.84       4.01        4.36       3.69
130                  8.65       5.13        6.09       4.80
135                  10.2       7.02        7.71       6.40
140                  16.4       9.39        12.3       8.84
145                  24.7       15.3        18.9       14.2
150                  38.8       23.4        34.4       21.6

[Figure 3, panels (a) and (b): x-axis, Higgs mass (GeV/$c^2$).]

FIG. 3: Observed and expected upper limits at 95% C.L. on Higgs boson production times branching ratio with respect to the 
SM expectation for all lepton and tag categories combined as a function of the Higgs boson mass for the present analysis (a) 
and after combination with the independent three jet analysis with a matrix element (b). 



VIII. CONCLUSIONS 

We have presented the results of a CDF search for the
standard model Higgs boson decaying to $b\bar{b}$ final states,
produced in association with a W boson decaying into
a charged lepton and a neutrino. We find that for the
dataset corresponding to an integrated luminosity of 7.5
fb$^{-1}$, the data agree with the SM background predictions.
We therefore set upper limits on the Higgs boson
production cross section times the $H \to b\bar{b}$ branching ratio
with respect to the standard model prediction. For
the mass range of 100 GeV/$c^2$ through 150 GeV/$c^2$ we
set observed (expected) upper limits at 95% C.L. from
1.34 (1.83) to 38.8 (23.4). For 115 GeV/$c^2$ the upper
limit is 3.64 (2.78). When we combine this search with
an independent search using events with three jets [14],
we set more stringent limits in the same mass range from
1.12 (1.79) to 34.4 (21.6). For 115 and 125 GeV/$c^2$ the
upper limits are 2.65 (2.60) and 4.36 (3.69), respectively.
Improved analysis techniques have resulted in an increase
in sensitivity over the previous 2.7 fb$^{-1}$ analysis [13] that is
25% greater than expected from simple luminosity
scaling.
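
For orientation, the gain expected from luminosity alone can be estimated with the sketch below. It assumes that the expected limit of a background-dominated search scales as $1/\sqrt{L}$. This is a common rule of thumb and is not stated in the text, and the function name is illustrative.

```python
import math


def luminosity_scaling_factor(old_lumi, new_lumi):
    """Factor by which an expected limit would shrink if it scaled as 1/sqrt(L)."""
    if old_lumi <= 0 or new_lumi <= 0:
        raise ValueError("luminosities must be positive")
    return math.sqrt(new_lumi / old_lumi)


# Integrated luminosities quoted in the text, in fb^-1.
factor = luminosity_scaling_factor(2.7, 7.5)
print(f"expected limit improves by a factor of {factor:.2f} from luminosity alone")
```

The improvement quoted in the text is additional to this purely statistical factor.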

The results in this channel at the CDF experiment
constitute the most sensitive low-mass Higgs boson search
at the Tevatron. While the LHC experiments will continue
to improve their sensitivity to the low-mass Higgs
boson, which is obtained primarily from searches in the
diphoton final state, we expect that the searches in the
$H \to b\bar{b}$ channel at the Tevatron will provide a crucial
test of the existence and nature of the low-mass Higgs
boson.